Reinforcement Learning with Particle Swarm Optimization Policy (PSO-P) in Continuous State and Action Spaces
Abstract
This article introduces a model-based reinforcement learning (RL) approach for continuous state and action spaces. While most RL methods try to find closed-form policies, the approach taken here employs numerical on-line optimization of control action sequences. First, a general method for reformulating RL problems as optimization tasks is provided. Subsequently, Particle Swarm Optimization (PSO) is applied to search for optimal solutions. This Particle Swarm Optimization Policy (PSO-P) is effective for high-dimensional state spaces and does not require a priori assumptions about adequate policy representations. Furthermore, by translating RL problems into optimization tasks, the rich collection of real-world inspired RL benchmarks is made available for benchmarking numerical optimization techniques. The effectiveness of PSO-P is demonstrated on two standard benchmarks: mountain car and cart pole.
Keywords: Benchmark, Cart Pole, Continuous Action Space, Continuous State Space, High-dimensional, Model-based, Mountain Car, Particle Swarm Optimization, Reinforcement Learning
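The idea summarized in the abstract, scoring a candidate action sequence by simulating it on a model and letting PSO search over such sequences, can be illustrated with a minimal sketch. It is not the authors' implementation. The mountain car constants, the shaped reward, the global-best swarm topology, the PSO coefficients, and the names `mountain_car_model`, `rollout_return`, and `pso_plan` are all assumptions chosen for brevity.

```python
import math
import random

def mountain_car_model(state, action):
    """Assumed mountain car dynamics; constants are illustrative only."""
    pos, vel = state
    vel = vel + 0.001 * action - 0.0025 * math.cos(3.0 * pos)
    vel = max(-0.07, min(0.07, vel))
    pos = max(-1.2, min(0.6, pos + vel))
    if pos == -1.2 and vel < 0.0:
        vel = 0.0  # inelastic left wall
    # Assumed shaped reward: negative distance to the goal at pos = 0.5.
    reward = 0.0 if pos >= 0.5 else pos - 0.5
    return (pos, vel), reward

def rollout_return(model, state, actions, gamma=0.99):
    """Discounted return of an action sequence simulated on the model."""
    total, discount = 0.0, 1.0
    for a in actions:
        state, r = model(state, a)
        total += discount * r
        discount *= gamma
    return total

def pso_plan(model, state, horizon=100, particles=20, iterations=30,
             w=0.72, c1=1.49, c2=1.49, rng=None):
    """Global-best PSO over action sequences in [-1, 1]^horizon."""
    rng = rng or random.Random(0)
    xs = [[rng.uniform(-1.0, 1.0) for _ in range(horizon)]
          for _ in range(particles)]
    vs = [[0.0] * horizon for _ in range(particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [rollout_return(model, state, x) for x in xs]
    g = max(range(particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(horizon):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus attraction to personal and global bests.
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                # Keep each action inside the assumed bounds.
                xs[i][d] = max(-1.0, min(1.0, xs[i][d] + vs[i][d]))
            f = rollout_return(model, state, xs[i])
            if f > pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f > gbest_f:
                    gbest, gbest_f = xs[i][:], f
    return gbest, gbest_f

if __name__ == "__main__":
    plan, value = pso_plan(mountain_car_model, (-0.5, 0.0))
    print("first action:", plan[0], "predicted return:", value)
```

Used on-line, a controller of this kind would typically apply only the first action of the returned sequence, observe the next state, and plan again from there. No policy representation has to be chosen in advance, which is the property the abstract emphasizes.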
Similar Resources
Q-Value Based Particle Swarm Optimization for Reinforcement Neuro-Fuzzy System Design
This paper proposes a combination of particle swarm optimization (PSO) and Q-value based safe reinforcement learning scheme for neuro-fuzzy systems (NFS). The proposed Q-value based particle swarm optimization (QPSO) fulfills PSO-based NFS with reinforcement learning; that is, it provides PSO-based NFS an alternative to learn optimal control policies under environments where only weak reinforce...
A new Reinforcement Learning-based Memetic Particle Swarm Optimizer
Developing an effective memetic algorithm that integrates the Particle Swarm Optimization (PSO) algorithm and a local search method is a difficult task. The challenging issues include when the local search method should be called, the frequency of calling the local search method, as well as which particle should undergo the local search operations. Motivated by this challenge, we introduce a ne...
Enhanced Comprehensive Learning Cooperative Particle Swarm Optimization with Fuzzy Inertia Weight (ECLCFPSO-IW)
Various optimization methods have been presented so far. Among the most popular are algorithms based on swarm intelligence, and one of the most successful of these is Particle Swarm Optimization (PSO). Earlier efforts have applied fuzzy logic to remedy defects of PSO such as trapping in local optima and premature convergence. Moreover to overcome the problem of i...
Particle swarm optimization for generating interpretable fuzzy reinforcement learning policies
Fuzzy controllers are efficient and interpretable system controllers for continuous state and action spaces. To date, such controllers have been constructed manually or trained automatically either using expert-generated problem-specific cost functions or incorporating detailed knowledge about the optimal control strategy. Both requirements for automatic training processes are not found in most...
A Particle Swarm Optimization derivative applied to cluster analysis
Modern machine learning and data analysis hinge on sophisticated search techniques. In general, exploration in high-dimensional and multi-modal spaces is needed. Some algorithms that imitate certain natural principles, the so-called evolutionary algorithms, have been used in different aspects of Environmental Science and have found numerous applications in Environmental related problems. In thi...
Journal title: IJSIR
Volume 7
Publication date: 2016